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1.  INTRODUCTION: 

Fitting  a  sum  of  exponential  model  to  equispaced  data,  popularly  called  as  linear 
compartment  model,  to  analyze  a  real  data  set  is  quite  common  to  the  scientists  and 
engineers.  See  for  example  Anderson  (1983),  Seber  and  Wild  (1989),  Gallant  (1987), 
Bates  and  Watts  (1988)  and  the  references  there  in.  In  this  paper  we  consider  the  fitting 
of  the  following  Unear  compartment  model; 

y{t)  =  £  ahe0kt  +  e(i)  (1) 

ifc=l 

Here  y(t)  is  observed  at  the  time  points  ti, . . .  tn,  which  are  assumed  to  be  equidistant,  a’s 
and  fts  are  unknown  parameters,  p,  is  also  assumed  to  be  unknown.  e(t)’s  are  error  random 
variables,  they  are  assumed  to  be  independent  and  identically  distributed  (i.i.d.)  with  zero 
mean  and  finite  variance  cr2.  The  problem  is,  given  a  sample  of  size  n,  y(ti), . . .  J/(tn)? 
estimate  p,  au . . .  ap  and  also  estimate  ft, . . .  ft.  Note  that  without  loss  of  generality  we 
can  assume  ft  <  ft  <  . . .  <  ft.  The  ordering  may  be  needed  in  the  identification  of  the 
estimates.  We  also  assume  p  <  This  assumption  is  realistic,  infact  in  practice  usually  n 
is  quite  large  compared  to  p. 

This  is  a  very  old  problem  and  well  known  problem  in  statistical  literature.  The 
first  reference  of  this  work  goes  back  to  the  pionerring  work  of  Prony  in  1795.  After 
that  several  articles  have  been  appeared  mainly  to  establish  some  iterative  procedures  to 
obtain  the  least  squares  estimators  (LSE),  when  ‘p’  is  known.  See  for  example  Barham 
and  Drane  (1972),  Golub  and  Pereyra  (1973),  Osborne  (1975,  1976),  Osborne  and  Smyth 
(1986,  1994),  Kahn  et  al.  (1992).  It  is  well  known  that  this  problem  is  a  numerically 
difficult  problem.  Recently  it  is  observed  by  Osborne  and  Smyth  (1994)  that  the  method 
originally  proposed  by  Osborne  (1975)  can  be  very  effective,  provided  the  initial  value 
is  reasonably  good.  Infact  it  is  observed  that  all  the  methods  are  quite  sensitive  to  the 
initial  value.  Unfortunately  no  where  in  the  literature  except  the  work  of  Prony  (1795) 
the  choice  of  the  initial  value  has  been  suggested.  Infact  we  will  see  that  in  many  cases 
the  Prony’ s  estimators  can  not  be  used  as  an  initial  guess.  We  discuss  a  method  which 
can  be  used  to  obtain  the  initial  guess  of  any  one  of  the  standard  algorithm.  We  also 
propose  four  different  types  of  the  confidence  intervals  and  compare  their  performances 
through  Monte  Carlo  simulations.  We  also  consider  the  estimation  of  ‘p\  We  propose  to 
use  the  information  theoretic  criteria  and  the  Cross  Validation  technique  in  this  purpose. 
We  apply  these  methods  to  one  real  life  data  set  and  see  how  the  proposed  methods  work 
in  practice. 

The  organisation  of  the  paper  is  as  follows:  Prom  Section  2  to  Section  6,  we  assume 
that  ‘p’  is  known.  In  Section  2,  we  introduce  the  seperable  regression  technique  in  this 
case  as  it  was  originally  proposed  by  Richards  (1961).  We  introduce  the  Prony’s  estimator 
in  Section  3  and  the  proposed  initial  value  estimator  is  presented  in  Section  4.  We  discuss 
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different  confidence  intervals  in  Section  5  and  some  experimental  results  are  presented  in 
Section  6.  Estimation  of  ‘p’  is  considered  in  Section  7  and  the  data  analysis  of  one  data 
set  is  being  done  in  Section  8.  Finally  we  draw  conclusions  of  our  work  in  Section  9. 


2.  LEAST  SQUARES  ESTIMATORS: 

Observe  that  under  the  assumption  of  additive  i.i.d.  error  the  LSE  can  be  obtained  by 
minimi7.ing  the  residual  sums  of  squares; 

(2) 

t=ti  L  jt=i 

with  respect  to  a  and  /9,  where  a  =  (cki,  . . .  ap)  and  (5  =  (/?i, . . . ,  /3P).  Here  R(a,  f3)  can  be 
written  in  a  matrix  form  as  follows; 

R(a ,  /?)  =  (( Y  -  A(f3)a)T  (Y  -  A (0)a)  (3) 

where  Y  =  (y(*i),  .  •  -,y{tn))  and  A(/3)  is  a  n  x  p  matrix  of  the  following  form 


(4) 


Now  observe  that  for  a  fixed  ‘p’,  the  LSE  of  a’s  are 

m  =  lAT(/3)A(ff)}-'AT(m  (5) 

Now  if  we  replace  a((3)  obtained  from  (5)  in  (3),  we  obtain; 

R{&,  P)  =  Q{P)  =  YT(I  -  Pa)Y  (6) 

where  Pa  =  A(/3)[AT(/3)A(/3)]-1AT(/3)  is  the  projection  matrix  on  the  space  spanned 
by  the  columns  of  A ((3).  Therefore  the  LSE  of  (a,  /3)  obtained  by  minimizing  R(a,/3) 
with  respect  to  a,  /?  is  same  as  first  obtaining  the  LSE,  /3,  of  j3  by  minimizing  Q{(3)  with 
respect  to  (3  and  then  obtain  the  LSE  of  a  by  putting  /3  in  place  of  j3  in  (5).  Osborne  (1975) 
considered  the  minimization  of  Q(3)  with  respect  to  /3  and  he  trcinsformed  the  minimization 
problem  to  an  non-finear  eigenvalue  problem  and  proposed  an  iterative  scheme  to  solve  it. 
Recently  it  is  observed  by  Osborne  and  Smyth  (1994)  that  the  iterative  scheme  originally 
proposed  by  Osborne(1975)  is  numerically  stable,  i.e.,  if  the  initial  value  of  the  iterative 
procedure  is  close  to  the  true  value  (within  the  radius  of  convergence)  then  the  iterative 
procedure  converges  almost  surely. 

3.  PRONY’S  ESTIMATOR: 
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Prony  observed  the  following  (see  for  example  Barrodale  and  Olesky;  1981).  If 

=  ake?kt  for  t  =  t\, . . . ,  tn  (7) 

then  there  exists  (p+1)  constants,  gp+\,  such  that 

9\liu  +  92fki+l  +  •  •  •  +  gp+iHti+p  =  0  for  i  =  1,  (8) 

Here  g  =  (gi, . . . , gp+i)  is  unique  upto  a  constant  multiplication  and  therefore  without  loss 
of  generality  we  can  assume  |g|  =  9i  =  1-  Moreover  it  can  be  shown  that  if  h  = 
then  e^h, . . . ,  e®fh  are  the  roots  of  the  following  polynomial  equation: 

B(x)  =  gi+  92%  +  •  •  •  +  9p+  i*p+1  =  0  (9) 

Therefore  there  exists  a  one  to  one  correspondence  between  g,  such  that  |g|  —  1  and 
g !  >  0,  with  the  nonlinear  parameter  0.  Interestingly  g  is  independent  of  a  (see  Tufts  and 
Kumaresan;  1982).  Observe  that  (8)  can  be  written  in  the  following  form; 


Mg  =  0  (10) 

where  M  is  the  n—p  xp+1  matrix  of  p’s.  The  rank  of  the  matrix  M  is  p  if  n—p  >  p+1  and 
g  is  the  eigenvector  corresponding  to  the  zero  eigenvalue  of  M^M.  This  result  can  be  used 
to  obtain  an  initial  estimator  of  0's  from  the  data.  The  idea  is  as  follows.  Assume  that 
the  error  variance  is  small  then  form  a  matrix  Y  from  M  by  replacing  by  yt  (assuming 
E(yt)  =  yt)  for  ti, . . . ,t„.  Then  obtain  g  =  (&, . . . ,gp)  as  the  eigenvector  corresponding 
to  the  minimum  eigenvalue  of  YTY.  Normalize  g  s.t.  |g|  =  1  and  gi  >  0.  Construct  the 
polynomial  equation 

B(x)  =  gi  +  <72*  +  . . .  +  gP+ i*p  =  0  (11) 

and  obtain  the  roots  of  the  polynomial  (11)  of  the  form  e^lh, . . . ,  (0ph.  Note  that  we  can 
always  assume  0\  <  02  <  ...  <  0P,  because  it  is  simply  renaming  the  estimators  and  then 
the  0i,..., 0P  can  be  used  as  estimators  of  0i, ..  .,0P.  It  usually  behaves  quite  well  as  an 
initial  guess  provided  the  error  variance  is  small.  If  the  error  variance  is  large  then  there 
are  usually  two  kinds  of  problems.  It  may  be  possible  that  some  roots  of  the  polynomial 
equation  (11)  are  complex  or  some  roots  may  be  negative.  In  both  the  cases  it  is  not 
possible  to  obtain  the  estimators  0\,..., 0p.  Osborne  (1975)  used  these  estimators  as  the 
initial  value  for  his  numerical  experiments.  In  our  numerical  experiments  we  observed  that 
Prony’s  estimator  does  not  exist  in  some  situations. 

4.  MODIFIED  PRONY  ESTIMATOR: 

Prony’s  idea  has  been  used  extensively  in  the  Signal  Processing  literature  to  estimate 
the  frequencies  of  an  undamped  exponential  model.  See  for  example  the  work  of  Ulrych 
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and  Clayton  (1976),  Kundu  (1994,  1995),  Kundu  and  Mitra  (1995a,  1995b),  Tufts  and 
Kumaresan  (1982).  Several  modifications  have  been  suggested  by  the  above  authors  for 
the  complex  valued  signals.  Unfortunately  these  methods  can  not  be  applied  in  the  case 
of  real  data  sets.  In  this  section  we  propose  a  new  estimator  which  is  a  modification  of  the 
existing  Prony’s  estimator,  and  it  can  be  used  quite  effectively  as  an  initial  estimator  for 
different  algorithms. 

First  consider  the  same  model  as  (1)  and  yt  and  are  defined  same  as  in  Section  3. 
Now  choose  L,  such  that,  p  <  L  <  n  —p.  It  can  be  shown  similarly  as  Tufts  and  Kumaresan 
(1982)  that  there  exists  bo,...,bi  such  that 

boPti  +  b\PtM  +  . . .  +  bipti+L  =  0  for  i  —  1, . . .  ,n  —  L.  (12) 

where  J2f=i  b*  =  1  and  e^h, e^h  are  the  p  roots,  among  the  L  roots,  of  the  following 
polynomial  equation; 

&o  +  b\X  +  —  +  bixL  =  0  (13) 

Now  we  can  write  (12)  in  the  matrix  form  as; 

Ab  =  0  (14) 

Since  the  rank  of  the  matrix  A  is  p,  therefore  b  is  not  unique  as  long  as  L  >  p  and 
L  + 1  <  n  —  L.  Observe  that  (14)  is  equivalent  to 

ATAb  =  0  (15) 

Consider  the  spectral  decomposition  of  ATA  as  follows; 

L+l 

AtA  =  A^a?  (16) 

i=i 

where  Ai  >  A2  >  . . .  >  A^+i  and  a;’s  are  the  orthonormal  eigenvectors  corresponding  to 
A».  Since  the  rank  of  ATA  is  p,  this  implies  Ap+i  =  . . .  =  Xl+i  =  0.  Therefore  any  vector 
b,  which  belongs  to  the  linear  space  spanned  by  [ap+x,  •  •  • ,  ai+1]  will  satisfy  (16).  Let’s 
consider  the  null  space  of  ATA  as  follows; 

N :  a£+i]  (17) 

Now  since  the  rank  of  N  is  L—p+ 1,  from  (8)  it  follows  that  there  are  (L-p-t-1)  independent 
vectors  of  the  following  form; 
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9i 

0 

0  ' 

• 

9\ 

0 

9p+ 1 

9p 

0 

0 

9p+ 1 

0 

0 

0 

0 

• 

• 

...  91 

0 

0 

•••  9p+ i  . 

(18) 


such  that  the  space  spanned  by  N  and  Ng  are  equal.  This  idea  can  be  used  to  estimate  g 
from  yt’s  as  follows. 


^  A  pm  ^  A 

Replace  /Vs  by  yt’s  to  form  the  matrix  A,  obtain  the  spectral  decomposition  of  A1  A 
as  follows; 

L+ 1 

ArA  =  2  AAif;  (19) 


i=l 


A  A  A 

where  Ai  >  . . .  >  A/,  and  a;  are  the  orthonormal  eigenvectors  corresponding  to  A*.  Although 
the  rank  of  A^A  is  p,  but  the  rank  of  ArA  is  L  +  1.  Obtain  the  estimation  of  the  null 
space  of  N  as; 


N  =  [ap+i  : . . . :  kL+1[ 


(20) 


Now  we  would  like  to  obtain  a  basis  of  N  of  the  form  (18),  which  can  be  used  to 
estimate  g.  Let’s  write 

"Ni 


N  = 


N2 


(21) 


A  ^  A 

where  Ni  is  a  p  +  1  x  L  +  1  —  p  matrix  and  N2  is  a  L  —  p  x  L  +  1  —  p  matrix.  Since  N2 
is  of  rank  L—p,  therefore  there  exists  an  unique  vector  b1  upto  a  constant  multiplication 
such  that  N2bJ  =  0.  Therefore  if  we  denote  g1  =  Nib1,  then; 


Nb1  = 


(22) 


Proceeding  in  the  same  manner,  we  can  say  that  there  exists  vectors  b1 , . . . ,  bL  p+1, 
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such  that; 


N 


:  bL~p+1 


g1 

0 

0  ' 

0 

g2 

0 

0 

0 

gL-P+1 

(23) 


Now  any  one  or  all  or  a  subset  of  gfc,  for  k  =  1, . . . ,  L  —  p  +  1,  can  be  used  to  estimate 
g.  It  is  important  to  note  that  when  E(yi)  =  yi,  for  i  =  1, then  gk  =  g,  for 
k  =  1, . . . ,  L  —  p  +  1.  Now  we  choose  that  particular  g*  from  the  L  —  p  +  1,  g’s  so  that 
it  ‘fits’  the  data  best,  i.e.  for  each  gfc;  1  <  k  <  L  —  p  +  1,  we  obtain  the  estimates  of 
(«i, . . . ,  ap)  and  (/3j, . . . ,  j3p)  and  then  calculate  the  residual  sums  of  squares.  We  choose 
that  set  of  a’s  and  /3’s  for  which  the  residual  sums  of  squares  is  minimum.  Although  we 
may  get  some  infisible  roots  with  respect  to  some  gfc,  but  it  is  observed  that  not  all  of 
them  will  give  infisible  roots. 


The  method  can  be  easily  applied  to  the  following  model; 

yt=ao  +  J2  «Jte4<  +  et  (24) 

k= 1 


that  is  when  the  constant  term  is  also  present.  We  apply  the  above  procedure  to  the  data 
set  yt  —  y,  when  y  —  £"=1  ^  if  it  is  known  that  all  the  /3’s  are  negative.  Otherwise  we 
apply  the  above  method  to  the  complete  model  namely 

yt  =  '$2  ai<e0kt + e<  (25) 

fc=0 


and  then  put  /3o  =0. 

Another  important  aspect  is  the  choice  of  L.  Although  in  theory  L  can  be  any  integer 
satisfying  p  <  L  <  n  —  p  but  it  is  observed  that  the  performance  of  the  modified  Prony 
estimator  depends  on  L.  It  is  observed  (numerically)  that  as  L  increases  at  the  beginning 
the  performance  of  the  modified  Prony  estimator  increases  in  terms  of  the  mean  squared 
errors  but  after  that  it  starts  decreasing.  It  is  observed  that  for  L  «  |  the  performance 
is  the  best,  although  no  theoretical  justifications  can  be  provided.  It  seems  more  work  is 
needed  in  this  direction. 

5.  CONFIDENCE  INTERVALS: 

In  this  section  we  propose  three  different  confidence  intervals  other  than  the  classical 
one  as  is  available  in  the  literature  (Gallant;  1987,  page  105). 
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One  important  aspect  of  this  model  is  its  asymptotic  properties,  namely  the  consistency 
and  the  asymptotic  normality.  If  we  take  a  particular  case  of  this  model,  namely  tj  =  *, 
for  i  =  1, . . .  n  and  p  -  1,  £*i  =  1,  then  it  follows  from  Wu  (1981)  that  any  estimator  of  8 
is  inconsistent.  Infact  to  obtain  the  consistency  and  the  asymptotic  normality  properties 
we  have  to  makp  the  boundedness  assumptions  on  t,’s  and  we  have  to  rewrite  the  model 
as  that  of  Kundu  (1994b).  Under  the  same  set  of  assumptions  as  of  Kundu  (1994b),  we 
have  the  following  result. 

If  we  denote  9  =  (aj, . . . ,  olp,  B\, . . . ,  / 3P ),  and  9  to  be  the  LSE  of  9  of  the  model  (1)  and 
9q  be  the  true  parameter  value  of  9  in  (1),  then  we  can  say; 

Theorem  1  The  LSE  9  of  9  of  the  model  (1)  is  strongly  consistent  and 

xfc(9  -  90 )  ->  N{ 0,  a2 A’1)  (26) 

where  a1  is  the  error  variance,  A  =  ((ay)),  with 

,  =  /‘fc'((M)y(0,t)*  (27) 

u  —  CL  J  a 

where 

h'i{9,t)  =  h(9,t ),  and  h(9,t)  =  ^ake0kt  (28) 

t>0i  1 

Here  a  and  b  are  the  lower  and  the  upper  bound  of  the  sampling  times.  In  our  case  we 
talfp  a  =  t\  and  b  =  tn.  Now  we  can  use  (26)  to  obtain  100(1  —  a)%  confidence  interval 
of  9.  Interested  readers  may  refer  to  Kundu  and  Mitra  (1996)  for  a  detailed  proof  of  the 
above  theorem. 

We  also  propose  two  Bootstrap  confidence  intervals.  One  is  the  percentile  Bootstrap 
and  the  other  one  is  the  Bootstrap-t  confidence  interval.  Percentile  Bootstrap  confidence 
interval  can  be  described  as  follows: 

[1]  From  the  sample  yt  ;  t  =  tj, . . . ,  tn,  estimate  a’s  and  /?’ s. 

[2]  Estimate  et  =  yt  —  where 

tit  =  die^jt 


[3]  Draw  a  random  sample  of  size  n  from  {etl , . . . , et„},  let  it  be  {e#,, . . . ,  £#„}. 

[4]  Obtain  the  Bootstrap  sample  yt* , . . . , tit „'■>■>  where 

V*u  =  tiu  +  et.  for  i  =  1, . . . ,  n 
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[5]  Estimate  a’s  and  /3’s  from  yt”s  and  denote  it  by  a*’s  and  3,1s. 

[6]  Repeat  the  process  (3)  to  (5)  NBOOT  times. 

[7]  From  the  NBOOT  estimators  obtain  the  lower  and  the  upper  bound  of  the  100(1  - 
a%  percentile  Bootstrap  confidence  interval  for  each  a*  and  3k- 

We  propose  the  following  algorithm  for  computing  the  Bootstrap-t  confidence  intervals. 

[1]  From  the  sample  yt  ;  t  =  ii, . . . ,  t„,  estimate  a’s  and  /Ts. 

[2]  Estimate  et  =  yt  —  yt,  where 

3= 1 

for  t  —  tj) . . .  ,  ijj. 

[3]  Draw  a  random  sample  of  size  n  from  { etl , . . . ,  ein},  let  it  be  {e#,, . . . ,  e#n}. 

[5]  Estimate  a’s  and  /3’s  from  y*t  and  also  estimate  a 2  as  a\  by  mean  residual  sums  of 
squares. 

[6]  For  each  a  or  (3  (say  0),  compute 

T  _  Vn(g*  -  6) 

&B 

[7]  Repeat  the  steps  (3)  to  (6)  NBOOT  times. 

[8]  From  the  NBOOT  estimators,  obtain  the  lower  and  upper  bound  of  100  (1  -  a)% 
Bootstrap-t  confidence  intevals  for  each  a*  and  3k- 

In  the  next  section  we  have  performed  some  numerical  experiments  to  see  how  the 
different  methods  behave  for  finite  sample. 

6.  NUMERICAL  EXPERIMENTS  AND  DISCUSSIONS: 

In  this  section  we  perform  some  numerical  experiments  mainly  to  see  how  the  Modified 
Prony’s  estimator  behaves  compared  to  the  usual  Prony  estimator  as  an  intial  guess  for 
different  iterative  procedures.  We  compare  Osborne’s  method,  which  is  well  known  to  work 
very  good  for  this  problem,  with  that  of  the  usual  Gauss  Newton  method.  We  also  compare 
different  confidence  intervals  with  respect  to  their  coverage  probability  and  average  length. 
We  consider  the  following  model; 


yt  =  — G.Oe-'232*  +  3.0e0119t  +  e*; 


t  =  1, . . .  ,n 


(29) 
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et  is  taken  to  be  i.i.d.,  normal  random  variables  with  mean  zero  and  finite  variance.  We 
take  the  values  of  n  as  25,  50,  75  and  100  and  the  different  values  of  cr  are  .01,  .05  and  .1. 
For  each  n  and  <r,  we  generate  500  samples  and  compute  the  Prony  estimators  (PE)  and 
the  modified  Prony  estimators  (MPE).  Then  using  the  PE  and  MPE  as  the  initial  guesses 
we  estimate  a’s  and  /?’ s  using  Gauss  Newton  (GN)  and  the  Osborne’s  (1975)  (OS)  method. 
We  compute  the  mean  squared  errors  (MSE)  of  the  different  estimators.  We  also  report 
the  median  number  of  iteration  counts  and  the  maximum  number  of  iteration  taken  over 
500  replications.  We  also  report  the  the  number  of  times  the  PE  does  not  give  the  feasible 
estimators. 

We  also  compute  the  confidence  intervals  of  the  different  parameters  by  four  different 
methods  namely  the  classical  one  ,  the  proposed  one  and  the  two  Bootstrap  confidence 
intervals.  We  compute  the  average  length  and  the  coverage  probability.  We  take  the 
nominal  level  to  be  90%  in  our  experiments.  We  report  all  the  results  in  Tables:  1-9.  In 
Table  1-4,  we  report  the  average  estimates  and  the  MSE’s  of  the  PE,  MPE  and  the  LSE. 
In  each  box,  the  first  row,  the  second  row  and  the  third  row  indicate  the  results  of  the  PE, 
MPE  and  LSE  respectively.  In  each  row  the  first  figure  represents  the  average  value  of 
the  corresponding  estimator  and  in  the  bracket  we  present  the  corresponding  MSE.  Table 
5  represents  the  comparison  of  the  OS  method  with  that  of  the  GN  method  when  the 
starting  value  is  different.  We  also  present  the  median  number  of  iteration  count  and  the 
maximum  iteration  taken  for  both  OS  method  and  GN  method,  when  the  initial  guess 
is  the  usual  PE  and  also  when  the  initial  guess  is  the  MPE.  In  each  box,  the  first  figure 
represents  the  median  iteration  count  and  in  bracket  the  figure  indicates  the  maximum 
number  of  iterations  required  over  500  replications  for  GN  method.  Similarly  the  second 
figure  reprents  the  result  for  OS  method.  In  Tables  6-9,  we  present  the  comparison  of 
the  performances  of  the  four  different  confidence  intervals.  In  each  table  we  present  the 
average  length  over  500  replications  and  in  bracket  we  present  the  coverage  probability. 

From  the  Tables  1-4,  it  is  observed  that  as  sample  size  increases  or  the  error  variance 
decreases  the  MSE  of  all  the  estimators,  namely  PE,  MPE  and  LSE,  decrease.  It  is  clear 
that  MPE  behave  better  than  PE  and  the  LSE  behave  better  than  MPE  in  all  the  cases 
in  terms  of  lower  MSE.  In  our  simulations  it  is  observed  that  although  PE  does  not  exist 
in  certain  cases  but  MPE  always  exist.  It  is  observed  that  PE  may  not  exist  if  the  sample 
size  is  small  and  the  error  variance  is  large.  If  the  sample  size  is  large  or  the  error  variance 
is  small  PE  exist.  From  Table  5,  it  is  observed  that  as  sample  size  increases  or  the  error 
variance  decreases,  the  median  number  of  iteration  count  or  the  maximum  number  of 
iteration  count  decreases  in  all  the  cases.  It  is  also  observed  that  the  required  number  of 
iteration  is  much  less  if  the  MPE  are  used  as  the  initial  guess  to  obtain  the  LSE  and  it 
is  very  prominent  if  the  error  variance  is  high.  It  is  also  observed  that  between  the  OS 
method  and  the  GN  method  the  OS  method  takes  less  number  of  iterations  compared  to 
GN  method  in  all  the  cases  considered.  It  indicates  that  it  is  better  to  use  the  OS  method 
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rather  than  the  GN  method  in  general.  Although  if  we  use  the  MPE  as  the  initial  guesses, 
it  does  not  make  much  differences. 

About  the  different  confidence  intervals,  it  is  observed  that  they  behave  quite  differently 
in  terms  of  the  average  confidence  length  and  the  coverage  probability.  It  is  observed  that 
as  sample  size  increases  or  the  error  variance  decreases  the  average  length  of  the  confidence 
intervals  decreases  in  all  the  four  cases.  It  is  also  observed  that  as  sample  size  increases  the 
coverage  probability  becomes  closer  to  the  nominal  value.  Among  the  different  methods  it 
is  observed  that  the  classical  method  can  not  maintain  the  nominal  coverage  probability 
in  many  situations,  where  as  the  asymptotic  method  or  the  Bootstrap  procedure  behaves 
better  than  the  classical  method  in  the  sense  that  the  coverage  probability  is  higher  than 
that  of  the  classical  one.  Although  in  certain  situations  it  also  can  not  maintain  the 
nominal  coverage  probability  and  in  certain  situations  the  coverage  probabilty  is  much 
higher  than  the  nominal  level.  Bootstrap-t  method  works  better  in  that  sense,  as  all  the 
cases  considered  it  is  capable  of  maintaining  the  nominal  level  although  the  average  length 
is  higher  in  most  of  the  cases.  It  is  also  computer  intensive.  Comparing  all  these  we 
recommend  to  use  Bootstrap-t  and  if  we  want  to  avoid  heavy  computation,  we  may  use 
the  asymptotic  confidence  interval. 


7.  ESTIMATION  OF  p: 


In  this  section  we  propose  different  methods  to  estimate  p.  We  assume  that  p  <  M, 
where  M  is  some  fixed  unknown  integer.  We  propose  to  use  Akaike  Information  Theoretic 
Criteria  (AIC)  and  Bayes  Information  Theoretic  Criteria  in  this  setup  as  it  was  originally 
proposed  by  Akaike  (1971),  Schwartz  (1978)  and  Risannen  (1978)  in  the  general  model 
selection  setup.  See  Kundu  and  Murali  (1996)  for  the  comparison  of  the  different  methods. 
Rao  (1986)  also  suggested  to  use  different  information  theoretic  criteria  to  estimate  the 
number  of  signals  in  a  similar  problem  in  Signal  Processing.  See  for  example  Kundu  (1993) 
and  Mitra  and  Kundu  (1996)  in  this  connection.  AIC  and  BIC  take  the  following  form  in 
this  case: 


AIC(k)  =  n  In Rl  +  2  x  number  of  free  parameters  (30) 

BIC(k)  =  n  In  Rl  +  ^ln  n  x  number  of  free  parameters  (31) 


AIC  (BIC)  chooses  that  k  for  which  (30) [(31)]  is  minimum.  Here  R\  is  the  residual 
sums  of  squares  when,  the  model  order  is  k,  i.e. 


(32) 


t= 1 


i=l 


We  can  use  the  Cross  Validation  technique  also  to  estimate  p.  We  propose  the  following 
procedure  for  the  Cross  Validation. 
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[1]  Take  p  =  1. 

[2]  From  the  data  ytl , . . . ,  yt„,  delete  ytk\  k  =  1, . . . n. 

[3]  Estimate  a’s  and  ,/3’s  from  ytl, . . . ,  ytk_„ytk+1,  -  -  -  ytn  by  using  the  missing  value  tech¬ 
nique  of  Kundu  and  Kundu  (1994). 

[4]  Estimate  ytk  by  ytk  and  obtain  the  Cross  Validatory  error 

Y.(ytk  -  Vtk ?■ 

k= 1 

[5]  Repeat  (2)  to  (4)  for  different  values  of  p  =  1, 2, . . . ,  M,  and  choose  that  value  of  ‘p’ 
for  which  the  Cross  Validatory  error  is  minimum. 


In  the  next  section  we  use  these  techniques  to  determine  the  order  of  the  model  for  one 
real  life  data  set. 

8.  DATA  ANALYSIS: 

In  this  section  we  consider  one  real  life  data  set  from  Osborne  (1972,  page  185).  The 
data  are  reporeted  in  the  following  form  ( t,yt ): 

(0,.844),  (10, .908),  (20, .932),  (30, .936),  (40, .925),  (50, .908),  (60, .881),  (70, .850),  (80, .818), 
(90, .784),  (100, .751),  (110, .718),  (120,-685),  (130, .658),  (140, .628),  (150, .603),  (160, .580), 
(170,-558),  (180,.538),  (190,-522),  (200, .506),  (210, .490),  (220, .478),  (230, .467),  (240, .457), 
(250, .448),  (260,.438),  (270, .431),  (280, .424),  (290, .420),  (300, .414),  (310, .411),  (320, .406). 

Observe  that  in  this  case  n  =  33,  h  =  10.0,  a  =  t\  —  0.0  and  b  =  £33  =  320.  From 
the  plot  it  is  clear  that  the  sum  of  exponentials  can  be  tried  to  fit  the  data.  First  we  use 
different  order  determination  criteria  as  it  is  proposed  in  Section  7.  Different  information 
theoretic  criteria  take  the  following  form: 


Model 

Ri 

NOP 

AIC 

BIC 

<*o 

1.1529 

2 

8.6952 

8.1918 

a.\e0lt 

.05105 

3 

-92.1733 

-92.9286 

ao  +  aieAt 

.05089 

4 

-90.2769 

-91.2839 

EL.a*e«*‘ 

.05082 

5 

-88.3223 

-89.5810 

ao  +  EL.a^*' 

.553E-4 

6 

-311.4900 

-313.0008 

EL. 

.550E-4 

7 

-309.6699 

-311.4320 

ao  +  ELi  a*efl*‘ 

^525E-4 

8 

-309.0169 

-311.0310 
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Therefore  it  is  observed  from  the  above  table  that  AIC  and  BIC  choose  the  following 
model: 

yt  =  q0  +  a\e0lt  +  a2eht  +  et  (33) 

Observe  that  Osborne  (1975)  and  Osborne  and  Smyth  (1995)  also  use  the  same  model  to 
fit  this  data.  They  did  not  mention  why  they  have  used  that  model.  Osborne  used  the 
initial  estimates  as  ao  =  .5,  <*i  =  1.5,  a2  =  -1.0,  /3i  —  -.01,  f32  =  -.02.  He  did  not  mention 
how  he  had  obtained  those  initial  estimates.  With  those  initial  estimates  he  obtained  the 
following  least  squares  estimates:  a0  =  .3754,  ot\  —  1.9358,  a2  =  -1.4647,  (3\  =  -.01287,  f32 
=  -.02212  after  27  iterations.  We  obtain  the  Modified  Prony  estimates  with  L  =  11,  as 
follows  in  this  case:o:o  =  .1993,  c*i  =  1.3670,  a2  —  -.7157,  =  -.0087,  fl2  —  -.0261.  Using 

these  values  as  the  initial  estimates  the  usual  Gauss  Newton  algorithm  converges  after  5 
iterations  to  the  same  value  as  that  of  Osborne.  The  Osborne’s  method  takes  4  iterations 
to  converge  in  this  case.  The  original  data  and  the  fitted  curve  are  shown  in  Figure  la.  The 
observed  and  the  fitted  values  are  shown  in  Figure  lb.  We  obtain  the  following  residuals: 

-2.499E-03,  4.611E-03,  1.171E-03,  -8.564E-04,  -2.649E-03,  7.331E-05,-2.650E-04, 

-3.521E-04,  8.159E-04,  7.785E-04,  1.487E-03,  1.209E-03,-5.484E-04,  1.902E-03, 

-6.150E-04,-1.783E-04,  2.069E-04,-4.138E-04,-9.608E-04,  6.685E-04,  5.900E-04, 

-1 .073E-03,-l .975E-04,  3.402E-04,  6.586E-04,  8.703E-04,-9.186E-04,  -6.096E-04, 

-1.111E-03,  6.592E-04,-2.205E-04,  1.318E-03,  3.384E-04. 

We  perform  run  test  (Draper  and  Smith;  1981,  page  159)  and  Durbin- Watson  test 
(Draper  and  Smith;  1981,  page  162)  on  the  residuals.  It  is  observed  that  z  =  .3594  and 
d  —  2.0437  for  rim  test  and  Durbin- Watson  test  respectively.  Therefore  both  the  tests 
confirm  the  independent  assumptions  of  the  error  random  variables. 

9.  CONCLUSIONS: 

In  this  paper  we  consider  the  linear  compartment  model  as  it  is  defined  in  (1).  We 
propose  MPE  to  obtain  the  initial  estimators,  which  is  a  very  important  problem.  It  is 
observed  that  the  proposed  initial  estimator  can  be  used  for  any  standard  algorithm  to 
obtain  the  LSE’s.  We  observe  that  although  PE  may  not  exist  in  certain  situation  but 
MPE  always  exist.  We  compare  the  usual  GN  method  with  that  of  the  OS  method  and 
it  is  observed  that  the  OS  method  works  very  well  compared  to  that  of  GN  if  the  initial 
guess  is  PE  but  if  we  use  the  MPE  as  the  initial  guesses  it  does  not  make  much  difference, 
i.e.  we  donot  need  any  special  purpose  algorithm  to  solve  this  particular  problem.  We  also 
propose  different  confidence  intervals  and  it  is  observed  that  the  Bootstrap-t  works  the 
best  in  terms  of  the  required  coverage  probability.  We  consider  the  problem  of  estimating 
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the  number  of  terms  in  the  linear  compartment  model.  We  propose  to  use  AIC  or  BIC 
to  estimate  the  number  of  terms:  We  also  provide  the  scheme  to  estimate  the  number  of 
terms  by  the  Cross  Validation  technique.  It  seems  any  one  of  them  can  be  used.  Since  the 
Cross  Validation  is  more  computer  intensive  AIC  or  BIC  may  be  preferred. 
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Table  1 

Average  Estimates  and  the  Mean  Squared  Errors  of  the  Different  Estimators 

N  =  25. 


a 

<*i 

<*2 

A 

0 2 

.01 

-6.008(.0071) 

-6.005(.0037) 

-6.000(.0002) 

2.999(.0087) 

2.999(.0064) 

3.000(.0003) 

-,2323(6.96E-5) 

-,2321(5.04E-5) 

-. 2320(2. 62E-6) 

.0119(2. 16E-6) 

. 0119(1. 53E-6) 
.0119(7.34E-8) 

.05 

-6.172(.2921) 
-6.120(.1308) 
-6.005 (.0065) 

3.135(.3520) 

3.044(.1980) 

3.002(.0083) 

-. 2354(1. 83E-3) 

-. 2349(1. 34E-3) 
-.2324(6.49E-5) 

.0114(6.22E-5) 

,0117(3.83E-5) 

.0119(1.86E-6) 

.1* 

-6.854(1.321) 

-6.321(.9316) 

-6.020(.0290) 

3.297(2.321) 

3.147(1.231) 

3.012(.0366) 

2821(1. 23E-2) 
-,2524(5.37E-3) 

-. 2329(2. 59E-4) 

.0120(6.78E-4) 

.0121(1.40E-4) 

.0118(7.69E-6) 

*  The  ordinary  Prony  Estimator  did  not  exist  25  times,  so  the  results  of  the  Prony  Esti¬ 
mators  are  based  on  the  average  of  475  replications. 


Table  2 

Average  Estimates  and  the  Mean  Squared  Errors  of  the  Different  Estimators 

N  =  50. 


a 

ai 

«2 

A 

A 

.01 

-6.006(.0070) 
-6.000 (.0030) 
-6.000 (1.49E-4) 

3.002(.0025) 

2.999(.0006) 

3.000(2.44E-5) 

.05 

-6.118(.2038) 

-6.p34(.0637) 

-6.001(.0038) 

3.069(.0695) 

2.977(.0182) 

3.000(6.07E-4) 

-.2354(1. 31E-3) 
-.2383(7. 14E-4) 
-.2322(2. 26E-5) 

. 0117(5. 78E-6) 
.0121(1.47E-6) 
.0119(4.91E-8) 

B 

-6.304(1.179) 

-6.168(.3166) 

-6.003(.0151) 

3.236(.7069) 

2.970(.0671) 

3.000(.0024) 

-.2365(5.62E-3) 
-.2510(3. 05E-3) 
-.2325(9. 16E-5) 

.0109(3.34E-5) 
.0122(5. 14E-6) 
.0119(1.96E-7) 

*  The  ordinary  Prony  Estimator  did  not  exist  11  times,  so  the  results  of  the  Prony  Esti¬ 
mators  are  based  on  the  average  of  489  replications. 
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Table  3 

Average  Estimates  and  the  Mean  Squared  Errors  of  the  Different  Estimators 

N  =  75 


a 

«i 

0-2 

A 

A 

.01 

-5.998(7. 18E-3) 
-6.003(2. 53E-3) 
-6.000(1.49E-4) 

2.998(1. 91E-3) 
2.999(2. 68E-4) 
3.000(6.47E-6) 

-,2323(4.80E-5) 

-,2324(2.30E-5) 

-.2320(4.91E-7) 

.0119(7.82E-8) 

.0119(1.06E-8) 

.0119(2.38E-10) 

.05 

-6.071(.2001) 

-6.055(.0571) 

-6.000(.0037) 

3.041  (.0532) 

2.991(6.55E-3) 

3.000(1.60E-4) 

-.2356(1.25E-3) 

-.2373(5.53E-4) 

-.2322(1.24E-5) 

,0119(2.10E-6) 

.0119(2.60E-7) 

.0119(5.92E-9) 

.1 

-6.157(1.127) 

-6.244(.2670) 

-6.002(.0151) 

3.125(.3311) 

2.986(.0241) 

3.000(6.34E-4) 

-.2370(5.45E-3) 

-,2521(2.42E-3) 

-.2324(5.03E-5) 

.0117(1.10E-5) 

.0120(9.55E-7) 

.0119(2.35E-8) 

Table  4 

Average  Estimates  and  the  Mean  Squared  Errors  of  the  Different  Estimators 

N  =  100 


a 

ai 

a2 

A 

P2 

.01 

-5.998(7.07E-3) 
-6.001(2. 11E-3) 
-6.000(1.49E-4) 

2.998(1. 15E-3) 
3.000(1.40E-4) 
3.000(3.77E-6) 

-,2323(4.63E-5) 

-,2326(2.23E-5) 

-.2320(4.43E-7) 

.0119(2.53E-8) 

.0119(2.94E-9) 

.0119(7.12E-11) 

.05 

-6.059(.1959) 
-6.102(.0411) 
-6.000 (.0033) 

3.030(.0305) 
3.004(3. 72E-3) 
3.000(9.41E-5) 

-.2360(1.21E-3) 

-.2389(5.48E-4) 

-.2322(1.12E-5) 

.0120(6.61E-7) 

.0118(7.94E-8) 

.0119(1.77E-9) 

B 

-6.103(1.115) 

-6.349(.2121) 

-6.002(.0152) 

3.074(.2213) 
3.004(.0129) 
3.000(3. 75E-4) 

-.2386(5.21E-3) 

-.2577(1.96E-3) 

-.2324(4.53E-5) 

.0119(3.68E-6) 

. 0119(2. 70E-7) 
.0119(7.08E-9) 

Table  5 

Comparison  of  the  Osborne’s  Method  and  the  Gauss  Newton  Method 
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N 

a 

PRONY 

MODIFIED  PRONY 

.01 

5(8),  5(7) 

2(3),  2(3) 

25 

.05 

7(D),  6(9) 

3(9),  3(9) 

.1 

15(100),  12(88) 

4(41),  4(36) 

.01 

4(8),  4(6) 

2(2),  2(2) 

50 

.05 

6(9),  5(9) 

3(3),  3(3) 

.1 

12(70),  10(70) 

.  3(5),  3(4) 

.01 

4(6),  4(6) 

2(2),  2(2) 

75 

.05 

4(9),  4(9) 

2(3),  2(3) 

.1 

10(63),  8(63) 

3(4),  3(4) 

.01 

3(3),  3(3) 

2(2),  2(2) 

100 

.05 

4(5),  4(5) 

2(3),  2(3) 

.1 

7(19),  6(15) 

3(4),  3(4) 

Table  6 

Different  Confidence  Intervals 
N  =  25 


a 

Methods 

<*i 

<* 2 

A 

A 

.01 

Asymptotic 

Classical 

Bootstrap 

Bootstrap-t 

.0527(.87) 

.0494(.83) 

,0497(.84) 

.0569(.91) 

.0600(.92) 
.0564(.91) 
.0563(.92) 
.0648 (.92) 

.0057(.93) 

.0051(.90) 

.0051(.91) 

.0058(.93) 

.0009(.92) 

.0009(.91) 

.0009(.89) 

.0009(.93) 

.05 

Asymptotic 

Classical 

Bootstrap 

Bootstrap-t 

.2661(.87) 

.2489(.82) 

.2520(.85) 

,2895(.91) 

.3012(.91) 

.2827(.91) 

.2845(.91) 

.3275(.92) 

.0285(.94) 

.0256(.90) 

.0253(.91) 

.0290(.93) 

.0005(.91) 

.0004(.91) 

.0004(.89) 

.0005(.93) 

.1 

Asymptotic 

Classical 

Bootstrap 

Bootstrap-t 

.5476(.89) 

.5106(.83) 

.5462(.85) 

.6290(.90) 

.6133(.92) 

.5754(.91) 

.6234(.92) 

.7124(.92) 

.0571 (.93) 
.0512(.90) 
.0527(.93) 
.0604(.92) 

.0092(.91) 
.0086(.91) 
.0091  (.92) 
.0104 (.92) 

Table  7 

Different  Confidence  Intervals 
N  =  50 


a 

Methods 

«2 

A 

A 

.01 

Asymptotic 

Classical 

Bootstrap 

Bootstrap-t 

.0493(.92) 

.0399(.90) 

.0404(.91) 

.0429(.91) 

.0148(.92) 

.0147(.88) 

.0148(.93) 

.0159(.91) 

.0030 (.88) 
.0027(.87) 
.0028(.87) 
.0029(.92) 

.0001  (.88) 
.0001  (.88) 
.0001  (.87) 
.0001  (.92) 

.05 

Asymptotic 

Classical 

Bootstrap 

Bootstrap-t 

.2419(.92) 

.2001(.90) 

.2023(.92) 

.2151(.92) 

.0739(.88) 

.0734(.86) 

.0738(.84) 

.0793(.91) 

,0151(.88) 

,0139(.86) 

.0139(.81) 

.0148(.90) 

.0007(.88) 

.0007(.87) 

.0007(.87) 

.0007(.93) 

.1 

Asymptotic 

Classical 

Bootstrap 

Bootstrap-t 

.4846(.91) 

.4007(.89) 

.4052(.91) 

.4316(.91) 

.1480(.87) 

.1469(.86) 

.1480(.87) 

.1590(.92) 

.0302 (.89) 
,0278(.88) 
.0278(.87) 
.0298(.93) 

.0013(.89) 

.0013(.87) 

.0013(.87) 

.0014(.91) 

Table  8 

Different  Confidence  Intervals 
N  =  75 


(7 

Methods 

«2 

A 

A 

Asymptotic 

.0489(.92) 

.0081  (.88) 

.0025(.91) 

4.94E-5(.88) 

.01 

Classical 

.0404 (.93) 

.0082(.88) 

.0023 (.93) 

4.95E-5(.88) 

Bootstrap 

.0406(.95) 

.0081  (.88) 

.0024(.93) 

4.91E-5(.88) 

Bootstrap-t 

.0422(.91) 

.0085(.91) 

.0025  (.92) 

5.1lE-5(.91) 

Asymptotic 

.2447(.92) 

.0408 (.89) 

.0127(.9l) 

.0002(.89) 

.05 

Classical 

.2021(.92) 

.0411  (.88) 

.0118(.93) 

.0002(.85) 

Bootstrap 

.2030(.92) 

.0405(.88) 

.0118(.93) 

.0002 (.88) 

Bootstrap-t 

.2114(.92) 

.0424 (.89) 

.0123 (.93) 

.0003(.91) 

Asymptotic 

.4900(.91) 

.0816(.88) 

.0255  (.92) 

.0005(.88) 

.1 

Classical 

.4047(.92) 

.0822(.88) 

.0236(.93) 

.0005(.88) 

Bootstrap 

.4069(.95) 

,0813(.88) 

.0237(.93) 

.0005 (.88) 

Bootstrap-t 

.4234(.92) 

.0849(.92) 

.0248 (.93) 

.0005(.91) 

Table  9 

Different  Confidence  Intervals 
N  =  100 


a 

Methods 

<*i 

«2 

Pi 

02 

.01 

Asymptotic 

Classical 

Bootstrap 

Bootstrap-t 

.0491(.92) 

.0406(.92) 

.0411(.91) 

.0422(.91) 

.0056(.88) 

,0057(.86) 

.0057(.87) 

,0058(.91) 

.0024(.92) 
.0022(.89) 
.0022(.89) 
.0023  (.90) 

2.51E-5(.88) 

2.52E-5(.87) 

2.49E-5(.87) 

2.58E-5(.90) 

.05 

Asymptotic 

Classical 

Bootstrap 

Bootstrap-t 

.2459(.92) 

.2034(.92) 

.2057(.91) 

.2111(.91) 

.0281(.88) 
.0284 (.84) 
.0283(.86) 
.0292(.91) 

.0119(.93) 

.0109(.89) 

.0109(.88) 

.0114(.90) 

.0001  (.88) 
.0001(.87) 
.0001  (.87) 
.0001  (.90) 

.1 

Asymptotic 

Classical 

Bootstrap 

Bootstrap-t 

.4925(.92) 

.4072(.92) 

.4113(.91) 

.4225(.91) 

.0562 (.89) 
,0569(.88) 
.0565 (.88) 
,0583(.90) 

.0239(.92) 
.0220 (.89) 
.0219 (.89) 
.0227(,91) 

.0003 (.89) 
.0003 (.88) 
.0002 (.85) 
.0003(.90) 

